A Characterization of the Combined Effects of Overlap and Imbalance on the SVM Classifier

Authors

  • Misha Denil
  • Thomas P. Trappenberg
Abstract

In this paper we demonstrate that two common problems in machine learning, imbalanced and overlapping data distributions, do not have independent effects on the performance of SVM classifiers. This result is notable because it shows that a model of either factor must account for the presence of the other. Our study of the relationship between these problems has led to the discovery of a previously unreported form of “covert” overfitting that is resilient to commonly used empirical regularization techniques. We demonstrate the existence of this covert phenomenon through several methods based on the parametric regularization of trained SVMs. Our findings in this area suggest a possible approach to quantifying overlap in real-world data sets.

Similar resources

Condition Assessment of Metal Oxide Surge Arrester Based on Multi-Layer SVM Classifier

This paper introduces indicators for surge arrester condition assessment based on leakage current analysis. The maximum amplitude of the fundamental harmonic of the resistive leakage current, the maximum amplitude of the third harmonic of the resistive leakage current, and the maximum amplitude of the fundamental harmonic of the capacitive leakage current were used as indicators for surge arrester condition mon...

Verification of unemployment benefits’ claims using Classifier Combination method

Unemployment insurance is one of the most popular insurance types in the modern world. The Social Security Organization is responsible for checking the unemployment benefits of individuals supported by unemployment insurance. Manual evaluation of unemployment claims requires a great deal of time and money. Data mining and machine learning, as two efficient tools for data analysis, can assist ...

A DWT and SVM based method for rolling element bearing fault diagnosis and its comparison with Artificial Neural Networks

A classification technique using a Support Vector Machine (SVM) classifier for the detection of rolling element bearing faults is presented here. The SVM was fed features extracted from vibration signals obtained from an experimental setup consisting of a rotating driveline mounted on rolling element bearings, which were run under normal and artificially induced fault conditio...

Combined application of computational fluid dynamics (CFD) and design of experiments (DOE) to hydrodynamic simulation of a coal classifier

A mixed modeling approach combining computational fluid dynamics (CFD) and design of experiments (DOE) was proposed to benefit simultaneously from the advantages of both methods. The presented method was validated using an industrial-scale coal hydraulic classifier. Effects of operating parameters including feed flow rate, solid content and baffle le...

SUBCLASS FUZZY-SVM CLASSIFIER AS AN EFFICIENT METHOD TO ENHANCE THE MASS DETECTION IN MAMMOGRAMS

This paper is concerned with the development of a novel classifier for automatic mass detection in mammograms, based on contourlet feature extraction in conjunction with statistical and fuzzy classifiers. In this method, mammograms are segmented into regions of interest (ROI) in order to extract features including geometrical and contourlet coefficients. The extracted features benefit from...

Journal title:
  • CoRR

Volume abs/1109.3532

Publication date 2011